Incremental Parser Generation for Tree Adjoining Grammars
Abstract
This paper describes the incremental generation of parse tables for the LR-type parsing of Tree Adjoining Languages (TALs). The algorithm presented handles modifications to the input grammar by updating the parser generated so far. In this paper, a lazy generation of LR-type parsers for TALs is defined in which parse tables are created by need while parsing. We then describe an incremental parser generator for TALs which responds to modification of the input grammar by updating parse tables built so far.

This work is partially supported by NSF grant NSFSTC SBR 8920230, ARPA grant N00014-94 and ARO grant DAAH04-94-G0426. Thanks to Breck Baldwin, Dania Egedi, Jason Eisner, B. Srinivas and the three anonymous reviewers for their valuable comments.

1 LR Parser Generation

Tree Adjoining Grammars (TAGs) are tree rewriting systems which combine trees with the single operation of adjoining. Familiarity with TAGs and their parsing techniques is assumed throughout the paper; see (Schabes and Joshi, 1991) for an introduction. We assume that our definition of TAG does not have the substitution operation. (Schabes and Vijay-Shanker, 1990) describes the construction of an LR parsing algorithm for TAGs; see (Aho et al., 1986) for details on LR parsing. Parser generation here is taken to be the construction of LR(0) tables (i.e., without any lookahead) for a particular TAG. The algorithm described here can be extended to use SLR(1) tables (Schabes and Vijay-Shanker, 1990).

The moves made by the parser can be explained by an automaton which is weakly equivalent to TAGs called Bottom-Up Embedded Pushdown Automata (BEPDA) (Schabes and Vijay-Shanker, 1990). Storage in a BEPDA is a sequence of stacks, where new stacks can be introduced above and below the top stack in the automaton. Recognition of adjunction is equivalent to the unwrap move shown in Fig. 1.
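The BEPDA storage described above can be illustrated with a small Python sketch. It is not taken from the paper: the class name `BEPDAStore`, its method names, and in particular the exact form of `unwrap` are assumptions made for illustration. Fig. 1 is not reproduced here, so the unwrap move is modelled in a simplified way, as discarding a given number of stacks below and above one wrapped stack and replacing that stack's top symbol.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BEPDAStore:
    """Hypothetical model of BEPDA storage: a sequence of stacks.

    stacks[-1] is the top stack; inside a stack, the last element is its top.
    """
    stacks: List[List[str]] = field(default_factory=list)

    def introduce_above(self, symbols: List[str]) -> None:
        """Add a new stack above the current top stack (it becomes the top)."""
        self.stacks.append(list(symbols))

    def introduce_below(self, symbols: List[str]) -> None:
        """Add a new stack directly below the current top stack."""
        if not self.stacks:
            raise ValueError("no top stack to insert below")
        self.stacks.insert(len(self.stacks) - 1, list(symbols))

    def unwrap(self, below: int, above: int, result: str) -> None:
        """Simplified unwrap (an assumption, not the paper's definition).

        Removes `below` stacks under and `above` stacks over a wrapped
        stack, replaces the wrapped stack's top symbol with `result`,
        and leaves the wrapped stack as the new top stack.
        """
        needed = below + above + 1
        if below < 0 or above < 0 or len(self.stacks) < needed:
            raise ValueError("not enough stacks to unwrap")
        wrapped = self.stacks[-(above + 1)]
        if not wrapped:
            raise ValueError("wrapped stack is empty")
        wrapped[-1] = result
        # Drop the wrapped stack together with its surrounding stacks,
        # then put the wrapped stack back as the top stack.
        del self.stacks[-needed:]
        self.stacks.append(wrapped)
```

For example, starting from an empty store, calling `introduce_above(["X", "Y"])`, `introduce_below(["l"])` and `introduce_above(["r"])` gives three stacks with the wrapped stack in the middle; `unwrap(1, 1, "Z")` then leaves a single stack. Plain lists are used so that inserting a stack below the top is a one-line operation; an actual parser would store LR states rather than strings.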
Similar resources
Negotiation Strategies in an Integrated Natural Language Generation System
This paper describes negotiation strategies in an integrated generation system (INLGS) based on the formalism of Schema–Tree Adjoining Grammars with Unification (SU–TAGs). Integrated or uniform means that all knowledge bases are specified in the same formalism and run the same processing algorithm. In our project a reversible parser/generator runs knowledge bases in the formalism of Schema– Tree...
Parsing with Underspecifications
This paper describes a direct parser for Schema–Tree Adjoining Grammars (S–TAG) which explores schemata, i.e. underspecified elementary rules. Basically, a schema in a S–TAG represents a possibly infinite set of elementary rules by folding up all actual substructures and depicting them in terms of a regular expression (RX). Hence, S–TAGs provide a more condensed grammar representation. In the f...
TuLiPA - Parsing Extensions of TAG with Range Concatenation Grammars
In this paper we present a parsing framework for extensions of Tree Adjoining Grammars (TAG) called TuLiPA (Tübingen Linguistic Parsing Architecture). In particular, besides TAG, the parser can process Tree-Tuple MCTAG with shared nodes (TT-MCTAG), a TAG-extension that has been proposed to deal with scrambling in free word order languages such as German. The central strategy of the parser is su...
Licensing and Tree Adjoining Grammar in Government Binding Parsing
This paper presents an implemented, psychologically plausible parsing model for Government Binding theory grammars. I make use of two main ideas: (1) a generalization of the licensing relations of [Abney, 1986] allows for the direct encoding of certain principles of grammar (e.g. Theta Criterion, Case Filter) which drive structure building; (2) the working space of the parser is constrained to ...
Fast LR parsing Using Rich (Tree Adjoining) Grammars
We describe an LR parser of parts-of-speech (and punctuation labels) for Tree Adjoining Grammars (TAGs) that solves table conflicts in a greedy way, with a limited amount of backtracking. We evaluate the parser using the Penn Treebank, showing that the method yields very fast parsers with at least reasonable accuracy, confirming the intuition that LR parsing benefits from the use of rich grammars.